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Abstract 

Many real life problems can be reduced to the solution of a complex expo- 
nentials approximation problem which is usually ill posed. Recently a new 
transform for solving this problem, formulated as a specific moments prob- 
lem in the plane, has been proposed in a theoretical framework. In this work 
some computational issues are addressed to make this new tool useful in 
practice. An algorithm is developed and used to solve a Nuclear Magnetic 
Resonance spectrometry problem, two time series interpolation and extrap- 
olation problems and a shape from moments problem. 
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Introduction 

Many signal processing problems (see e.g. [23]) can be formulated as a com- 
plex exponential interpolation problem (CEIP): given the complex numbers 
Sfc, k — 0, 1, 2, . . . 2p — 1, to find complex numbers {cj, j = 1, . . . ,p such 
that 

p 

s * = I>i^' k = 0,l,...,2p-l. (1) 

or, equivalently ^5\, to find poles £j and corresponding residues = Cj/£j 
of the rational function s(^) whose first 2p Taylor coefficients at z = are 
Sfc, /c = 0, 1, 2, . . . 2p — 1. The problem can be restated as a generalized 
eigenvalue problem as follows. Let us consider Hankel matrices Uo(s) and 
Ui(s) given by 
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where s = [s , . . . , s 2p -i]. Because of fll]), the following factorizations hold 

U (s) = VCV T , U^s) = VCZV T 
where V is the Vandermonde matrix based on • • • , £p), 

C = diag{ci, . . . , c p } and Z = diag^, . . . , £ p }. 
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Therefore j — 1, . . . , p) are the generalized eigenvalues of the pencil P = 
[Ui(s), U (s)} and (&,-, j = 1, . . . ,p) are related to the generalized eigenvector 
Uj = V~ T e_j of P by Cj = uJ[so, . . . , s p _i] T , where is the j— th column of 
the identity matrix J p of order p. A further equivalent formulation is based 
on the complex measure 

v 

where D is a compact subset of W and 5 is the Dirac distribution. It turns 
out that for k — 0, 1, 2, . . . 

Sk — z k S(z)dz = (x + iy) k S(x + iy)dxdy. 

Jd J Jd 

Therefore Sk is the fc-th harmonic moment of the measure S and the com- 
plex exponential interpolation problem is equivalent to a specific moment 
problem in the plane consisting in retrieving the distribution S from Sk, k = 

0. 1.2, ...2p — 1. Conditions for existence and unicity of the solution are 
detU (s) ^ 0,de*l7i(s) ^ (see e.g. [151 Th.7.2c]). 

More realistically, by denoting in bold all random quantities, let us consider 
the discrete stochastic process defined by 

&k = Sk + k = 0, 1, 2, . . . , n - 1 (2) 

where n > 2p and is a complex Gaussian zero-mean white noise discrete 
process with known variance a 2 . We want therefore to solve the complex 
exponential approximation problem (CEAP) consisting in estimating p and 
{cj, j = 1, . . . ,p, from a realization afe, k = 0, 1, 2, . . . , n — 1 of a/-. This 
is equivalent, when p is known, to solve a Pade' approximation problem 

1. e. to compute the [p,p — 1] Pade' approximant of the formal power series 
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f( z ) = Y2k a k z ~ k y or to solving a generalized eigenvalue problem for non- 
square pencils [TTl ITj or a specific noisy moments problem in the plane. Even 
if p were known the problem would be quite difficult and usually ill-posed. 
A wide literature exists on the subject. We can summarize some well known 
facts as follows (see e.g. [Hi [Hi [K)] ) • The problem is optimally conditioned 
when £j are equispaced on the unit circle. In this case in fact model ([1]) 
reduces to the Fourier model which is an orthogonal one. Clusters of £j 
are more difficult to estimate than well separated ones. Complex exponen- 
tials with relatively small \cj\ are more difficult to estimate than those with 
relatively large weight. 

Recently a new approach for solving the complex exponential approxima- 
tion problem in a stochastic framework was proposed pQ, which exploits the 
relation with generalized eigenvalue problems and with moments problems 
outlined above but without assuming to know p. It makes use of tools from 
the theory of logarithmic potential with external fields [22] and the theory 
of random polynomials [Si EH] and provides an estimate of p and point and 
interval estimates of {cj,£j}. 

In this work some computational and numerical issues are addressed to 
make this new tool useful in practice. An algorithm is developed and tested 
on well known difficult problems. 

The paper is organized as follows. In Section 1 the method introduced in 
[1] is shortly summarized. In Section 2 the proposed algorithm is discussed. 
In Section 3 the algorithm is used to solve a Nuclear Magnetic Resonance 
spectrometry problem, a time series interpolation and extrapolation prob- 
lem and a shape from moments problem providing some comparisons with 
existing methods. 
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1 The new transform 

Starting from a&, k = 0, 1, 2, . . . , n — 1, assuming n even, let us consider the 
stochastic CEIP (i.e. a CEIP for each realization of {a^}) 

n/2 

a fc = Zl c ^i' fc = 0,l,...,n-l.. (3) 

i=i 

and the associated random measure 

n/2 

S n (z,a) = -*,■)■ (4) 

j'=i 

Let us also define the random Hankel | x | matrices t/o(a), f^i(a), where 
a = [a , . . . , a^-i]. The generalized eigenvalues ^ •, j = 1, . . . , n/2 of the 
random pencil P = [t/i(a), f/ (a)] satisfy the equation 

Pn/2(«) = de*[E/i(a) - zt/b(a)] = 

where p n / 2 (-z) is a random polynomial. We can then consider the expected 
value of the (random) normalized counting measure on the zeros j = 
1, . . . , n/2 of this polynomial (condensed density, [HI [5]): 



fr n (z) = —E 
n 



n/2 



In [2] it was proved that, when s = 0, in the limit for n — > oo the condensed 
density is a distribution supported on the unit circle and it can be proved 
(PP) that in the limit for o — >• oo the generalized eigenvalues £• tend to 
concentrate on the unit circle and, in the limit for o — > 0, they concentrate 
around the true £j,j = l,...,p. It is therefore evident that in order to solve 
CEAP, the first issue to address is the identifiability one. If the Signal-to 
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Noise ratio (SNR) is not large enough with respect to the signal structure 
as discussed in the introduction, there is no hope to solve CEAP. The first 
step of the method introduced in p] provides a tool for assessing if CEAP is 
solvable based on the properties of the condensed density of the generalized 
eigenvalues h n (z). More precisely we give the following: 

Definition 1 The measure S(z) is identifiable from a/-, k = 0, . . . , n — 1 if 
3 r& > 0, k — 1, . . . ,p such that 

• h n (z) is unimodal in iV& = {z || \z — < r^} 



The following result, proved in [TJ, gives the relation between S n (z,cx), and 
the unknown measure >S'(^) 

Theorem 1 If S(z) is identifiable from a then 



As in the limit for a — >• 0, the condensed density tends continuously to a dis- 
tribution supported on the true j — 1, . . . ,p, it does exist a small enough 
to make S(z) identifiable from a and in this case we can use the random mea- 
sure S n (^, a) to estimate S(z) by using Theorem [TJ To perform this program 
we need two steps. The first one consists in either to check the ident inability 
of the measure S(z) from a or to properly design the experiment (i.e. to 




E[S n (z, cr))dz = c h + o(cr), h = l,...,p 



and 




j 
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choose n and a) in order to get identifiability. The second step consists in 
building an estimator of S n (z, a). 

About the first step we notice that of course the function h n (z) cannot be 
computed because we do not know s i.e. the mean of a. However, assuming to 
know s, we can use h n (z) to state whether S(z) is identifiable from the data. 
Unfortunately even in the Gaussian assumption the analytic computation of 
h n (z) is hard. However it can be approximated ([I]) by 

K{z) = lo ?>M z )) 

fj,j(z)>0 

where A is the Laplacian operator acting on z and Hj(z) are the eigenvalues 
of 

'2 



na 



where s 



A(z,z 



(U^s) - zf/ U))(tAU) - zU (s)) + —A(z,z) 

[sq, . . . , S n _x], 

l + l^l 2 -z ... 
l + \z\ 2 -, 



(5) 



— z 















-z l + l^l 2 
and overline denotes conjugation. 

Remark. From equation ((5]) it follows that n should not be as large as possible 
to get the best estimates of S(z). In fact too many data will convey too much 
noise which could mask the signal 

We have therefore a tool either to check identifiability or to design properly 
the experiment. In most real problems we have some prior information about 
the unknown measure S(z). We can then compute h n (z) for several candidate 
measures compatible with our prior information and choose values n and a 
that make the candidate measures identifiable. 
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We now move to the second step of the procedure consisting in estimating 
the random measure S n (z, a) and extracting from it the required information. 
If we have R samples from the data discrete stochastic process a we can 
estimate E[S n (z, a)] by solving CEIP for each sample or***, r = 1, ...,R, 
i.e. finding j = l,...,n/2, such that a« = Ejficf (#>)*, fc = 

0, 1, . . . , n — 1 and then taking the sample mean 

1 R n/2 
r=l 3=1 

If only one sample is available we can use the following method proposed 
in pp. We notice first that in order to cope with the Dirac distribution 
appearing in the definition of S n (z, a), it is convenient to use an alternative 
expression given by (see [2]) 

n/2 

S n (z,a) = — A^cjlogflz-^f). 

Then we build independent replications of the data process (pseudosamples) 
by defining 

a i — afc + ^fcj fc = 0, ...,n — 1; r = l,...,i? 

where {i 7 ^} are i.i.d. zero mean complex Gaussian variables with variance 
cr' 2 and therefore ajjT^ have variance a 2 = a 2 + a' 2 . We then define the 
estimator, conditioned to a 

/ R n/2 \ 

8U*>*) = ^ EE c fMk-^l) (6) 

\r=l 3=1 J 

where (cj \ — 1> • • • , n /2 are the solution of CEIP for the pseudodata 

(r) 

a k , k = 0, . . . ,n — 1 which are computable by a MonteCarlo procedure 
given a. In [1] the following theorem is proved 
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Theorem 2 Let M(z) and M c (z) be the mean squared error ofS n (z,o~) and 
S c nR (z,a) respectively. In the limit for a — > 0, it exists a' and R{o~') such 
that WR > R(a'), M c (z) < M(z) Vz. 

In order to estimate (cj,£j), j = l,...,p, we make use of Theorem [TJ In 
fact, if S(z) is identifiable, there exist disjoint sets N k ,k = l,...,p such 



that 



S Nk E [ S n(z,cr)]dz 



^> cr, and each of them should include one and only 
one Therefore looking at the sets A such that \J A S Uj r(z, a))dz\ ^> a it is 
possible to identify p disjoint sets N k which possibly include the true This 
can be done by computing a discrete transform by evaluating R (z, a) on a 
suitable lattice L = {(xi, Ui),i = 1, . . . , N} by taking a discretization of the 
Laplacian operator, giving rise to a matrix P(cr) G 3?^ VxAr ' ) - the P-transform 
of the vector a - such that P(i,j, a) = R (xi + iyj). The p relative maxima 
of the absolute value of the P-transform are then computed as well as disjoint 
neighbors N k centered on them. Estimates of are obtained by 

averaging the (cp , & ) which belong to each N k ■ The name " transform" is 
justified by observing that to the vector a we associate the matrix P (direct 
transform), and to the matrix P we associate the vector whose components 
are 



dk = Cj£j — > a k , when a — » 



(inverse transform). 



2 The algorithm 

The method for estimating the unknown parameters p, {(cj,£j), ,j = 1, . . . ,p} 
outlined in the previous section is quite expensive and delicate from the nu- 
merical point of view. In this section we discuss the main issues to be ad- 
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dressed to implement the basic method and suggest a new approach which 
mimics the basic one giving rise to a fast and reliable algorithm. 

The computation of the P-transform is the most critical part of the whole 
procedure. There are many algorithms to compute (cj,£j) based on different 
approaches (see e.g. [121 E] for short reviews) which are useful in different 
applied contexts. If computational burden is the principal issue and the 
geometric structure of the unknown measure S(z) is simple, extremely fast 
algorithms based on the generalized orthogonality of Pade' polynomials can 
be used to compute (cj,£j) ( [T3| pg. 631-632], [6J). If clusters of poles can 
be expected it is better to solve the generalized eigenvalue problem e.g. as 
discussed in [19] and [12] where several advanced methods are presented or 
|14j . where the Hankel structure of the pencil P = [Ui(a), Uq(cl)] is taken into 
account to speed up the computation and QR factorization and QZ iteration 
are used as well as a suitable diagonal scaling of the pencil P, for achieving 
numerical stability. An even more expensive method is described in [23] 
where a total least squares approach is used taking into account the Hankel 
structure and the noise affecting the elements of P. A classical approach 
is given by Prony's method [20] which splits the problem in three parts by 
solving two linear least squares problems with Toeplitz and Vandermonde 
structure respectively and a polynomial rooting problem. Fast codes for all 
these sub-steps do exist [131 sect. 4. 6,4. 7] as well as total least squares [27] 
and structured total least squares algorithms [T7] . 

A further complication is due to the fact that for computing the P-transform 
R generalized eigenvalue problems have to be solved. An effective compro- 
mise between accuracy and speed of computation is given by the following 
procedure: 

• compute ( c j,£,j), j — l,---,n/2, by solving the generalized eigen- 
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value problem for the pencil P by one of the accurate methods quoted 
above. If the method described in [TJj is used the computational cost 
of this step is 0((f) 3 ) 

• select the generalized eigenvalues ^f 1 , j = l,...,p corresponding to 
the p largest values \Cj\ where p is an upper bound of p, p < | 

• for each pseudosample gS r ' compute the coefficients of the polynomial 

p{z) = det[£/ 1 (a (r) ) -zU (a^)} 

by the first step of Prony's method. This requires 0((^) 2 ) flops because 
of the Hankel structure of Uo(a^) 

(r) 

• to compute Q , j = 1, . . .p, apply a fast iteration such as e.g. Laguerre 
method [30] to the polynomial p(z), taking as initial values Q°\ j = 
1, . . . ,p. Usually it converges in few iterations, therefore it costs 0(|p) 
flops or less if the Horner scheme to compute the polynomial derivatives 
is implemented through the fast Fourier transform [25] . 

• to compute cp , j = 1, . . .p, apply the third step of Prony's method, 

(r) 

forming the Vandermonde matrix of £ ■ , j = 1 , . . . p and solving a 
least squares problem e.g. by LSQR method [2T], which is a good 
compromise between accuracy and computational speed. Usually a 
few iterations are sufficient, therefore it costs flops. 

• The last step for computing the P-transform consists in evaluating the 
summation in and then computing a discrete Laplacian. This can 
be the most expensive part of the procedure because the summation 
must be computed for each Zj = (xi,yi), i = 1, . . . ,N of the lattice 
L. Therefore we need 0(RpN 2 ) flops. However we notice that we only 



Complex exponentials approximation 



12 



need an estimate of the local maxima of the absolute value of the P- 
transform. These are likely to be close to the centroids of poles clusters 
and their value is a monotonic increasing function of the corresponding 

(r) 

\Cj \. A fast way to estimate them consists then in applying a clustering 
method, such e.g. k-means, to the Rp vectors of M 3 

m?\*#\\c? ) \],r = l,---,R,3 = l,---,P 
looking for p clusters. The clustering algorithm can be initialized by 

computed in the first two steps. We then compute 

sue*.*) = i A (EK r) ^-^)) ( ? ) 

for z G Nk, k = 1, . . . ,p where iV& is a small regular mesh of points 
with size 5, centered on the centroid of the k— th cluster. Finally, after 
theorem [1], we select the p < p clusters such that 



S c n<R (z h ,a)5' 

z h £N k 



> Ka 



where K > 1. Estimates (cj,£j), j = 1, ... ,p of (cj,£j), j = l,...,p 
are then obtained by averaging the (c^ , Q ) which belong to the se- 
lected clusters. The computational cost of the clustering algorithm and 
the computation of S° R (z, a) is 0(R(p) 2 ) flops. 

Summing up we can solve the CEAP problem in 0((|) 3 ) + 0(-R(|) 2 ) flops. 
In most applications R < n is enough to get good results, therefore 0(n 3 ) 
flops is a reasonable upper bound for solving the problem in most cases 
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(fast method). In a few particularly difficult problems the computation of 
Q r \ j = l,...p,r = 1,...,R is better performed by the same accurate 
methods used for r = 0. In these cases the computational burden becomes 
0(n 4 ) (slow method). 

3 Numerical experiments 

In order to appreciate the behavior of the proposed algorithm in practice, four 
examples on real and synthetic data are presented. The first one copes with 
the classic problem of quantification of Nuclear Magnetic Resonance spec- 
tra (see e.g. [28j [29]) which is usually solved by ad hoc methods requiring 
visual inspection by the operator. The second example is an interpolation- 
extrapolation problem on a synthetic time series used in the 2004 Competi- 
tion on Artificial Time Series, organized in the framework of the European 
Neural Network Society [8J. Comparisons with the results obtained by par- 
ticipants are provided. The third example is an interpolation problem of a 
real acoustic signal with a missing fragment. The aim here is to reconstruct 
the missing part in order to make the reconstructed signal to sound as the 
original. The last example is a shape from moments reconstruction problem. 
It turns out that the identification of a polygonal region in the plane from 
its complex moments can be formulated as a specific CEIP (9j [14] . Synthetic 
data sets are generated and the results are compared with those obtained in 
[T2] when the number of the polygon vertices is known. Moreover the case 
when the number of vertices is unknown is also addressed. 

We notice that several hyperparameters have to be chosen e.g. the upper 
bound p of p, the number R of pseudosamples, the variance a' 2 of {i'jj. } and 
the constant K. Moreover one of the most critical hyperparameter is the 
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number n of data points, as noted in the Remark in section 2. Usually we 
can only cut some data in order to reduce the noise. In order to select good 
hyperparameters a performance criterion is chosen and the method is applied 
for several values of the hyperparameters in suitable intervals. Then those 
that give the best value of the performance criterion are used to compute the 
final results. The performance criterion is problem dependent. However a 
standard residual analysis provides usually a good basis to build up a good 
criterion. In the following the number of residuals whose absolute value is 
larger than a is used as performance criterion. 

3.1 NMR spectroscopy 

In the top part of figUJa l H Magnetic Resonance absorption spectrum from 
a diluted aqueous solution of the tripeptide Glutathione in its reduced form 
(GSH) is shown. It is computed by taking the discrete Fourier transform 
(DFT) of a Free Induction Decay (FID) signal of 4096 data points. In ideal 
conditions the physical model for the FID is a linear combination with pos- 
itive weights of complex exponentials. The absorption spectrum is the real 
part of the Fourier transform of the FID. It turns out that it is given by a 
linear combination of Lorentian functions. The spectroscopist is interested 
in estimating the parameters characterizing these Lorentian lines, namely 
their modes, widths and relative areas which are simply related respectively 
to the argument of the complex exponentials modeling the FID, to their ab- 
solute value and to weights associated to them. In a real experimental setup 
the ideal conditions are no longer true. Standard methods, implemented on 
most spectrometers, fit each peak of the absorption spectrum with a Loren- 
tian function. If the peaks are close, a very ill conditioned non linear problem 
has to be solved which can heavily depend on the interactive choices of the 
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spectroscopist to initialize the procedure. A better alternative is provided by 
time-domain methods (see e.g. [28, 29J) which exploit the fact that the FID 
can be modeled by complex exponentials. The problem can still be very ill 
conditioned. However, if the SNR is large enough, reasonable estimates of 
the quantities of interest can be obtained by solving the CEAP problem for 
the FID by the proposed method, which provides a global stable solution 
and no longer requires critic interaction with the spectroscopist. 

The analysis is performed in the interval of the spectrum marked by the 
rectangle in the top part of figJU A quadruplet whose areas are in the ratios 
1 : 3 : 3 : 1 is the theoretical reference. The frequencies are measured in parts 
per million (ppm). The standard interactive procedure provides an estimate 
of the areas of the four peaks such that their ratios are 1.02 : 3.21 : 3.15 : 1. 
In order to apply the proposed method the FID is first filtered by a pass- 
band Fourier filter [H [3]. In the middle-bottom part of the same figure, the 
absorption spectrum of the filtered FID is shown. When the main peaks of the 
spectrum are clustered and the clusters are well separated, it is in fact possible 
to split the analysis by filtering out from the FID all the frequencies but those 
belonging to a given interval [18]. The filtered FID is given by only 300 data 
points and the proposed method was applied to solve the CEAP for it. The 
results are shown on the middle-top part of figlH Four estimated Lorentian 
lines marked 1 — 4 are plotted and their areas are reported in the legend as 
well their modes in ppm. The ratios of the areas are 1.07 : 2.98 : 2.92 : 1 
which compare favorably with those estimated interactively. On the bottom 
part of the figure the weighted sum of the four Lorentian lines is plotted. 
The agreement with the zoomed absorption spectrum on the middle-bottom 
part is quite good. 
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3.2 Time series interpolation and extrapolation 

In order to apply the proposed method to solve extrapolation problems it is 
enough to solve a CEAP for the measured data and then evaluate the model 
on the extrapolation abscissas. To solve an interpolation problem we notice 
that, in the noiseless case, we can consider the segments of data before and 
after the missed segment as produced by the same model (pQ) for a set of in- 
dices k displaced by a fixed quantity q. It is easy to show that the generalized 
eigenvalues and eigenvectors are invariant for such a displacement . Therefore 
we can solve two separate CEAPs for the observed segments, and apply the 
proposed method to the pooled generalized eigenvalues and eigenvectors. We 

(r) 

need only to modify the Vandermonde matrix for computing Cj in the last 
step to take into account the gap in the observations. Assuming that each 
segment has n observations we have = V^a, where 



1 



1 



1 



(r) 



(r) 



(r) 



V 




(en 



/tW\n-l 
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The interpolated values are then obtained by 



tn 



tn 
S2 



iimt 



Yr. V 



£n+q— 1 
SI 



tn+q— 1 
S2 



Sp 



The first example in this class of problems copes with a time series of 5000 
samples with 100 missing values at times 981 - 1000, 1981 - 2000, 2981 - 
3000, 3981 —4000, 4981 — 5000. Therefore we want to solve four interpolation 
and one extrapolation problems. As the data are synthetic the truth is known 
and the results obtained by 17 methods are reported in \8\ where the mean 
squared error (MSE) for the interpolation problems and the interpolation + 
extrapolation problems are reported. It can be argued that the MSE is not 
the best discrepancy measure for this data set because a fit with a smoothing 
cubic spline gives results better than all of the 17 quoted methods for the 
interpolation + extrapolation problems and better than 15 of them for the 
extrapolation problem. Therefore we want to see how much the proposed 
method is able to improve on the solution provided by the cubic spline. We 
then apply the method to the residual obtained by subtracting the smoothing 
spline from the data. In figj2] top left the full time series with missing data 
is plotted. The other plots show the true values and the reconstructed ones 
on each missed data interval. The MSE 100 = 270 and MSE S0 = 195 have 
to be compared with MSE W0 = 408 and MSE 8Q = 222 which are the best 
results obtained in [8] by two different methods among the 17 considered. 
The second example is illustrated in fig. [3j An audio signal, corresponding 
to a ringing bell, made up of 50000 samples at 11025 Hz is considered. The 
first 10000 samples are plotted in the top left part of the figure. A fragment 
of 1000 samples are put to zero as shown in the top right part of the figure. 
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The method is applied to interpolate the missing fragment. Two data sets 
made up of 300 samples each before and after the missing data are considered 
as shown in the middle left part of the figure. The results are shown in the 
middle right part of the figure where 50 missed data are plotted superim- 
posed to the interpolated values. Even if the fit is not impressive most of 
the main spectral characteristics of the signal are well reproduced as shown 
in the bottom part of the figure where the Fourier spectrum of the original 
complete signal is plotted on the left, and the Fourier spectrum of the com- 
plete signal with the missing fragment replaced by the interpolated values 
is shown on the right. The sound produced by the reconstructed signal is 
almost undistinguishable from the original one. 
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5.74e-2 
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4. 5 le-4 


4.27e-4 




le-5 


4.59e-5 


4.28e-5 



Table 1: For the star shaped polygon and the C shaped polygon, the RMSE 
averaged over all the vertices obtained in [32] when p is known and equal to 
the true value for a = le -3 , le -4 , le -5 is reported in the third column. In the 
fourth column the corresponding RMSE obtained by the proposed procedure 
is reported. 
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3.3 Shape from moments problems 

In [9], |TJ] it was shown that the p vertices £1, . . . ,£ p of a non degenerate 
polygon V and its complex moments k — 0, 1, . . . , 2p — 1 are related by 



assuming that the vertices are arranged in counterclockwise direction in the 
order of increasing index and extending the indexing of the £ 3 - cyclically 
so that £ = £ p , £i = Cp+i- Therefore to identify the polygon (i.e. its 
vertices) from its complex moments is equivalent to solve a CEIP for the data 
Sk = k(k — l)fik- In [12] several methods for solving this specific problem were 
compared on two different polygons for o = 10~ 3 , 10~ 4 , 10~ 5 by a simulation 
experiment involving N = 100 independent replications and n — 101 noisy 
moments. For comparison, in Table 1 the results obtained by the proposed 
method and the best among those reported in [12} Tables IV, VIII, bold 
figures] are reported. The root mean squared error (RMSE) averaged over 
all parameters is computed by 



As the best results were obtained in [12] by using GPOF method ([IE]) but 
in one case, also in the proposed procedure the solution of the generalized 
eigenvalue problem (step 1) was obtained by GPOF with the same setup 
used in [12]. Therefore p = p is assumed to be known, as in [12], and 
the P-transform was not computed because all the p estimated clusters were 








where 
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retained. Moreover this was the only example where GPOF was used also for 
computing j = 1, . . .p, r = 1, . . . , R (slow method). An improvement 
can be noticed in all cases. In the first column of fig. H] the estimated £j 
for cr = 10 -4 and for the considered polygons are plotted. We notice that in 
some vertices, the £ 3 - are so concentrated that they coincide with one point at 
the used resolution. Next we use the full fast proposed procedure assuming 
not to know p and putting p = n/2. The RMSE averaged over all parameters 
£j and the mean and standard deviation of p are reported in Table 2. In the 
second column of fig. H]the estimated £j for cr = 10 -4 and for the considered 
polygons are plotted. 





cr 


RMSE 


p mean 


p s.d. 


Star shape 


le-3 


1.07e-l 


8 


3 




le-4 


7.62e-2 


10 


3 




le-5 


2.98e-2 


11 


4 


C shape 


le-3 


4.13c-2 


9 


1 




le-4 


2.94e-2 


9 


2 




le-5 


2.72e-2 


9 


2 



Table 2: For the star shaped polygon and the C shaped polygon, the RMSE 
averaged over all the vertices obtained in the case of p unknown for o = 
le~ 3 , le -4 , le -5 is reported in the third column. In the fourth and fifth 
columns the estimated mean and s.d. of p are reported. 

4 Conclusion 

A new approach for solving a classic inverse ill-posed problem is discussed 
from the computational point of view. The approach is a perturbative one, 
therefore it exploits the information generated by solving several closed prob- 
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lems by any standard method which best suits the user's needs such e.g. nu- 
merical quality and/or computational speed. The final results are obtained 
by an "averaging" step, hence they are quite stable with respect to noise 
and, provided that some hyperparameters are properly selected, sensitivity 
is also preserved, allowing to retrieve features of the signal which are masked 
by the noise. Several numerical examples are presented which confirm these 
practical abilities often improving on the results given by known methods. 
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3. 885 3.772 3.659 3.546 3.432 3.319 3.206 3.Q92 





01 ppm = 3.028, area = 0. 155 

02 ppm = 3.026, area = 0.430 

03 ppm = 3.024, area = 0.421 

04 ppm - 3.022, area = 0.144 



2.866 2.753 



!.051 3.044 3.036 3.029 3.022 3.014 3.007 2.999 2.992 2.984 2.977 




3.051 3.044 3.036 3.029 3.022 3.014 3.007 2.999 2.992 2.984 2.977 




3.051 3.044 3.036 3.029 3.022 3.014 3.007 2.999 2.992 2.984 2.977 

Figure 1: Quantification of NMR spectra. Top: the NMR Fourier spectrum 
and the ROI. Middle-top: the estimated Lorentian lines in the ROI and their 
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Figure 2: Top left: time series with five missing intervals. True values on each 
interval (-); interpolated values (+). Total MSE on the first four intervals = 
195. Total MSE on the five intervals = 270. 
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Figure 3: Top left: a segment of an audio signal; top right: the part missed 
is shown; middle left: the data used to interpolate; middle right: a fraction of 
the interpolated data (+) are superimposed to the unknown ones (-); bottom 
left: Fourier spectrum of the missed part; bottom right: Fourier spectrum of 
the reconstructed data. 
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Figure 4: Estimates of the vertices of the star shaped and C shaped polygons 
obtained by the proposed method on N = 100 replications with o = l.e -4 . 
Left: the true number of vertices is known. Right: the true number of vertices 
is unknown. 



